Much of our research has focused on the implications of the
particle/extended-body motion distinction for how people reason about simple dynamical
events (Proffitt & Gilden, 1989). A recurring theme is that people differentially
appreciate the dynamical significance of translational and rotational motions; they
possess a far better understanding of translations. Imagine, for example, that you are
observing the behavior of a toy top. Whether or not a top is spinning, it falls straight
down when dropped (translational context); however, if placed on a pedestal (rotational
context) its behavior is influenced by its spin. If it is not spinning, then it falls off the
pedestal, whereas if the top is spinning, then it precesses. The behavior of a top in free
fall is easily assimilated by common sense, whereas the precession of a spinning top
balanced on a pedestal is not. In the latter case, we are amused because the spinning
top looks like it ought to fall even though our present and past experience with the toy
informs us that it does not. We have investigated people’s common-sense
understandings of rotational dynamics and found them to be profoundly muddled
relative to their intuitions about translational dynamics (for reviews see Gilden, 1991;
Proffitt & Gilden, 1989). Two findings are of special interest. First, reasoning about
rotational  dynamics  is  not  much  improved  by  viewing  ongoing  events  (Kaiser,  Proffitt, 
Whelan,  &  Hecht,  1992).  Second,  people  who  have  explicit  knowledge  about  rotational 
dynamics  (physics  teachers)  or  who  have  considerable  experience  observing  their 
behavior  (bicycle  racers,  professional  billiards  players)  have  essentially  the  same 
common-sense  intuitions  about  rotational  dynamics  as  do  novices  (Proffitt,  Kaiser,  & 
Whelan,  1990;  Hecht,  1992). 

The difference in people's understandings of translational versus rotational events
is  due,  in  large  part,  to  differences  in  the  representations  that  are  demanded  from  the 
observer  (Gilden,  1991;  Proffitt  &  Gilden,  1989).  These  differential  demands  reflect 
both  mathematical  and  physiological  constraints  on  motion  processing. 

Mathematical  and  Physiological  Constraints  on  Motion  Representation 
Axis  constraints 

Translations  and  rotations  can  be  distinguished  in  terms  of  what  can  be 
represented  locally  on  the  basis  of  correlation  in  space-time.  Every  point  on  a 
translating  object  undergoes  the  same  motion;  thus,  if  the  velocity  of  some  point  on  an 
object  is  detected,  then  the  motion  of  the  whole  object  is  known.  This  is  not  the  case 
for  rotations.  Detecting  the  instantaneous  velocity  of  a  point  on  a  rotating  object 
provides  very  little  information  about  the  object’s  motion.  The  direction  of  rotation 
cannot  be  determined  from  knowing  the  tangential  velocity  at  a  point  without  further 
specification  of  the  position  of  the  point  relative  to  the  axis  of  rotation.  Neither  can  the 
magnitude  of  angular  velocity  be  determined  from  a  local  velocity  vector  because  a 
point’s  velocity  is  due  to  two  factors:  angular  velocity  and  the  point’s  distance  from  the 
axis of rotation. The axis of rotation is implicated in both direction and angular speed,
and local motion detection cannot establish an axis. Translations are the only motions
that can be represented without the specification of an axis. The manner in which
translations differ from all other motions entails the following three discussions: (1) a
review of formal models of motion mechanisms, (2) an analysis of what can be
represented by these mechanisms, and (3) a description of the geometry of motion
fields.
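
The ambiguity of a single local velocity can be made concrete with a small numerical sketch. It assumes planar rigid rotation, in which a point at offset r from the axis moves with velocity v = ω × r. The function name, the sign convention (counterclockwise positive), and the three candidate rotations are illustrative assumptions, not part of the cited models.

```python
def point_velocity(omega, axis, point):
    """Velocity of `point` on a body rotating in the plane at angular
    velocity `omega` (rad/s, counterclockwise positive) about `axis`."""
    rx, ry = point[0] - axis[0], point[1] - axis[1]
    # Planar form of v = omega x r
    return (-omega * ry, omega * rx)

point = (0.0, 0.0)  # location sampled by a local motion detector

# Three different rotations: different sign, speed, and axis location
candidates = [
    (+1.0, (0.0, -1.0)),  # counterclockwise, axis below the point
    (-1.0, (0.0, +1.0)),  # clockwise, axis above the point
    (+0.5, (0.0, -2.0)),  # slower counterclockwise, axis farther away
]

velocities = [point_velocity(w, a, point) for w, a in candidates]
for (w, a), v in zip(candidates, velocities):
    print(f"omega={w:+.1f}, axis={a}: local velocity={v}")
```

All three rotations produce the same leftward unit velocity at the sampled point, so neither the sign nor the magnitude of the rotation can be read from that velocity alone.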

Formal  Models  of  Motion  Mechanisms.  It  is  well  known  that  there  is  a  class  of 
cells both in and outside of primary visual cortex (V1, area 17) that are selectively tuned
for direction of translational motion (see Movshon, Adelson, Gizzi, & Newsome, 1983,
for a review). There is increasing evidence that direction selectivity in these cells can be
successfully modeled through linear combinations of space-time separable filters
(Adelson  &  Bergen,  1983;  Watson  &  Ahumada,  1983).  Recent  modifications  of  this 
basic  model  have  incorporated  contrast  gain  control  and  an  expansive  power  law 
response  (Albrecht  &  Geisler,  1991)  without  changing  the  internal  logic  of  the  direction 
selective  mechanism. 

All models of motion detection make use of the basic notion of correlation in
space-time (cf. Adelson & Bergen, 1983). The minimum definition of a motion unit is
that it connects the appearance of contrast at point (x, t) with a second appearance at
point (x+dx, t+dt). In a space-time plot these two points are displaced diagonally, and
we shall refer to their connection by a motion unit as a diagonal correlation. Intrinsic
to the design of these simple detectors is their locality. Formally, there are several
senses in which these units are local. First, the spatiotemporal interval (dx, dt) might be
small, in which case the units are local in the sense of neighborhood. However, there is
another sense in which these units are local that is of greater concern here. Primitive
motion units correlate only two points in space-time, not three or more. Formal models
of motion units in V1 that link (x, t) with (x+dx, t+dt) are not conditionalized upon
activity at any other point (y, t). In this sense, these motion units are also local in the
sense of identity; they correlate corresponding appearances of the same thing. Motion
units may, of course, be linked in excitatory or inhibitory ways, and there may be various
types of spatial pooling. Regardless of these additional complexities, single motion units
appear to be logically constructed around the notion of individual correspondence or
diagonal correlation.
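
A minimal sketch of a two-point diagonal correlator follows. It is a simplified delay-and-multiply scheme in the spirit of a Reichardt-type detector, not the filter-based energy models cited above. The discrete space-time array, the opponent subtraction, and all function names are assumptions made for illustration.

```python
def diagonal_correlation(stimulus, dx, dt):
    """Sum of s[t][x] * s[t+dt][x+dx] over all (x, t): a unit that links
    (x, t) with (x+dx, t+dt) and is not conditioned on any other point."""
    n_t, n_x = len(stimulus), len(stimulus[0])
    total = 0
    for t in range(n_t - dt):
        for x in range(n_x):
            x2 = x + dx
            if 0 <= x2 < n_x:
                total += stimulus[t][x] * stimulus[t + dt][x2]
    return total

def opponent_response(stimulus, dx=1, dt=1):
    """Rightward minus leftward correlation; the sign indicates direction."""
    return (diagonal_correlation(stimulus, dx, dt)
            - diagonal_correlation(stimulus, -dx, dt))

def drifting_dot(n_x, n_t, step):
    """Space-time array (rows are time) with one dot moving `step` cells per frame."""
    start = 0 if step > 0 else n_x - 1
    return [[1 if x == start + step * t else 0 for x in range(n_x)]
            for t in range(n_t)]

rightward = opponent_response(drifting_dot(8, 6, +1))
leftward = opponent_response(drifting_dot(8, 6, -1))
print(rightward, leftward)
```

The unit signals translation direction from two space-time samples alone, which is the sense of locality described above.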

Representation by Correlation. Formal models of motion provide a useful point
of departure for understanding the processing of rotation direction and rotation speed.
Correlation mechanisms are essentially designed to respond to drifting contrast and for
this reason cannot compute rotation sign unambiguously. This is clear from the
geometry  of  rotation;  the  instantaneous  direction  of  any  part  of  the  rotating  object 
depends  on  the  rotation  phase.  A  single  motion  unit  cannot  unconfound  rotation  phase 
from  rotation  sign.  In  order  for  clockwise  to  be  distinguished  from  counterclockwise,  it 
is  necessary  that  two  or  more  motion  units  be  linked  together. 

The  sort  of  linkage  that  is  required  for  the  computation  of  sign  of  rotation  is 
more  complex  than  spatial  pooling  over  the  outputs  of  individual  motion  units.  The 
kind  of  linkage  that  is  required  here  must  recognize  that  there  is  an  axis  of  rotation  and 
that this axis induces a coupling between spatial layout and drifting contrast. For
definiteness, consider the case of a needle that is rotating clockwise about its center and
that is instantaneously vertical. In this case, the linkage must represent the coupling
that the top half is moving to the right and the bottom half is moving to the left. These
interaction terms (couplings) are introduced by the axis of rotation and in principle
cannot be rendered by simply summing over the outputs of correlators.
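
The needle example can be sketched numerically. The code below assumes a rigid vertical needle sampled at a few points, with counterclockwise rotation taken as positive; the names `pooled_output` and `coupled_output` are hypothetical labels for the two kinds of combination being contrasted, not terms from the literature.

```python
def needle_velocities(omega, half_length=1.0, n=5):
    """(height, horizontal velocity) pairs for points on a vertical needle
    rotating about its center. omega > 0 is counterclockwise."""
    heights = [half_length * (2 * i / (n - 1) - 1) for i in range(n)]
    # For a point at (0, y), v = omega x r gives v_x = -omega * y
    return [(y, -omega * y) for y in heights]

def pooled_output(omega):
    """Additive pooling: sum the local velocities, ignoring where they occur."""
    return sum(v for _, v in needle_velocities(omega))

def coupled_output(omega):
    """Position-weighted sum: each velocity is coupled to its signed
    position relative to the axis (z-component of r x v)."""
    return sum(-y * v for y, v in needle_velocities(omega))

clockwise, counterclockwise = -1.0, +1.0
print(pooled_output(clockwise), pooled_output(counterclockwise))
print(coupled_output(clockwise), coupled_output(counterclockwise))
```

Pooling gives zero for both rotation directions because the rightward and leftward halves cancel. Only the sum that refers each velocity to its position relative to the axis changes sign with the rotation.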

In  addition,  a  rotating  point’s  tangential  speed  confounds  the  magnitude  of 
angular  speed  with  the  point’s  distance  from  the  axis.  In  the  case  of  translation, 
detecting the speed of one point is sufficient to specify the speed for the whole object.
Detecting the instantaneous speed for a point on a rotating object does not provide
information about motions of other object points; their linear speeds depend upon
where they are located relative to the axis of rotation. The specification of angular
speed from linear speed implicitly relies upon a representation of spatial layout rich
enough to define an axis. Thus, angular speed cannot be discerned from local
space-time correlations.

It  should  be  noted  that  there  are  cells  specifically  tuned  for  direction  of 
rotational  motion  that  have  been  isolated  in  the  medial  superior  temporal  (MST)  area 
of the monkey (Sakata, Shibutani, Ito, & Tsurugai, 1986; Tanaka & Saito, 1989; Tanaka,
Fukada, & Saito, 1989). However, these cells occur relatively late in visual processing
compared with the direction selective units that we consider here. Furthermore, the
receptive fields of these cells are quite large, on the order of 40 to 80 degrees, and it is not at
all clear what relation they have to the perception of local rotation. These cells could
not, for example, process the types of motion arrays that we propose to use in our
studies, where individual elements often subtend less than a degree of arc. Rotation
selective  cells  in  MST  seem  to  be  more  related  to  detecting  the  retinal  flow  associated 
with  head  tilt  than  to  object  rotation. 

The Geometry of General Motion Fields. The comments made here regarding
rotation axes generalize to any motion field that is organized by a geometric element.
Representation of motion direction and magnitude within such fields will always require
the coupling of spatial relations derived from the axis into local motion directions and
amplitudes. These couplings cannot be represented by additive poolings over
mechanisms that respond to local energy drift. There is a taxonomy of motion fields
supplied by the calculus of vector fields (summarized by Koenderink, 1986),
which shows that all possible motion fields other than translations are organized by
geometric elements. All motion fields are composed from the following four
transformations, which are illustrated in Figure 1:


Figure  1 


1. Pure translation: the rigid displacement of texture. This is the only type of
flow that is not organized by a spatially distinct geometric element such as a point or an
axis. Translation may be formally viewed as a rotation about an axis at infinity, but
even so, all translations will have this axis in common. Different translations are not
individuated by their having different spatially located axes or points.

2. Pure divergence: the expansion or contraction of texture about a point.

3.  Pure  curl:  rotation  about  an  axis,  shear  flow  separated  by  a  line. 

4.  Pure  deformation:  expansion  about  one  axis,  contraction  about  an  orthogonal 
axis.  This  case  is  distinguished  from  pure  divergence  in  that  the  flow  is  volume 
preserving. 
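
The four components listed above can be read off the first-order terms of a locally linear flow field. The sketch below assumes a two-dimensional flow (u, v) = (u0, v0) + G·(x, y) with gradient matrix G = [[ux, uy], [vx, vy]]; in this 2-D setting the deformation component preserves area rather than volume. The function and key names are illustrative.

```python
import math

def decompose_flow(u0, v0, ux, uy, vx, vy):
    """Split a locally linear 2-D flow into translation, divergence,
    curl, and deformation using the entries of its gradient matrix."""
    return {
        "translation": (u0, v0),
        "divergence": ux + vy,                        # isotropic expansion rate
        "curl": vx - uy,                              # twice the angular rate
        "deformation": math.hypot(ux - vy, uy + vx),  # area-preserving stretch
    }

rotation = decompose_flow(0, 0, 0, -1, 1, 0)   # u = -y, v = x
expansion = decompose_flow(0, 0, 1, 0, 0, 1)   # u = x,  v = y
stretch = decompose_flow(0, 0, 1, 0, 0, -1)    # u = x,  v = -y
drift = decompose_flow(2, 0, 0, 0, 0, 0)       # uniform rightward motion

for name, field in [("rotation", rotation), ("expansion", expansion),
                    ("stretch", stretch), ("drift", drift)]:
    print(name, field)
```

Only the translation term is independent of position. Divergence, curl, and deformation are all defined through spatial derivatives, that is, through how velocity varies around a point or axis.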

Thus, translation motion fields are singled out as the unique instance
where sign and magnitude can be represented by correlation. The other motion types
require the spatial localization of a geometric element, i.e., the localization of a line or a
point.

Ordering  Constraints 

There  is  an  additional  distinction  between  translation  and  rotation  that  appears 
to  be  relevant  to  perceptual  and  cognitive  appreciations  of  these  types  of  motion.  The 
simplest way to state this distinction is that when an object translates it goes somewhere
and when it rotates it does not. Formally, this distinction means that translation
generates an ordered group in displacement while rotation does not. If we denote a point
of departure by "0" and translation over some specified distance by T(x), then 0 < T(0)
< T(T(0)) and so on. Rotation does not generate a similar hierarchy of inequalities
because orientation is not globally ordered; eventually the displacement exceeds 180°
and for some x, R(R(x)) < R(x). Although rotation is ordered locally for restricted
angles of rotation, and this ordering can be extended globally by defining angles that
exceed 180°, perceptually this extension is not meaningful. The simple observation that
rotation in a single direction accumulates modulo 180°, whereas translation in a single
direction accumulates continuously, has the following associated consequences:

1.  Rotation  always  generates  a  bounded  flow  field  and  when  it  is  about  an  axis 
internal  to  the  body  it  generates  a  flow  field  that  has  a  size  on  the  order  of  the  size  of 
the  body.  Translation  generates  unbounded  flow. 

2.  Rotations  of  opposite  sign  can  map  object  texture  to  the  same  position. 
Translations  of  opposite  sign  always  map  texture  into  different  positions. 

3.  Finite  sized  objects  may  have  symmetries  that  prevent  rotation  from  being 
detected.  Finite  sized  objects  never  have  symmetries  that  interfere  with  the  detectability 
of  translation. 
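
The ordering difference can be illustrated with a short sketch. It treats visible orientation as an angle reduced modulo a period; the default of 360° is an assumption suited to an asymmetric object, and a period of 180° would correspond to a symmetric needle. Step sizes and function names are illustrative.

```python
def accumulate_translation(step, n):
    """Positions after 0..n equal translations of size `step`."""
    return [step * k for k in range(n + 1)]

def accumulate_orientation(step_deg, n, period=360.0):
    """Visible orientations after 0..n equal rotations of `step_deg`,
    reduced modulo `period`."""
    return [(step_deg * k) % period for k in range(n + 1)]

def is_strictly_increasing(seq):
    return all(a < b for a, b in zip(seq, seq[1:]))

positions = accumulate_translation(1.0, 8)
orientations = accumulate_orientation(100.0, 8)

print(is_strictly_increasing(positions))     # translation keeps its order
print(is_strictly_increasing(orientations))  # orientation wraps around

# Consequence 2: rotations of opposite sign can reach the same orientation
half_turn_cw = accumulate_orientation(-180.0, 1)[-1]
half_turn_ccw = accumulate_orientation(+180.0, 1)[-1]
print(half_turn_cw == half_turn_ccw)
```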

These  ideas  have  been  articulated  in  a  different  context  by  Proffitt  and  Cutting 
(1980).  They  distinguished  between  object  rotations  and  translations  in  terms  of  form 
and  motion  analysis.  In  essence,  they  argued  that  the  perceptual  system  uses  rotations 
to  define  what  the  object’s  3-D  structure  is,  whereas  translations  are  used  to  specify 
where it is going. We make essentially the same argument here: because
translation accumulates and rotation does not, translation is a more usable source of
information for appreciating event kinematics.

These  inherent  differences  between  translations  and  rotations  have  implications 
for  all  levels  of  processing.  We  have  investigated  these  implications  in  the  areas  of 
detection,  memory,  imagination,  and  reasoning. 

Implications of the constraints for attention

Julesz and Hesse (1970) provided initial evidence that sign of rotation is
processed  serially.  This  claim  was  based  on  the  observation  that  a  field  of  rotating 
needles  divided  into  regions  based  on  sign  of  rotation  (clockwise  or  counterclockwise) 
does  not  effortlessly  segment  and  perceptual  boundaries  do  not  form  between  regions. 
Subsequently,  Gilden  and  Kaiser  (1992)  conducted  a  visual  search  experiment  using 
reaction  time  to  probe  processing  time  as  a  function  of  the  number  of  rotating  elements. 
They  found  evidence  for  a  serial  process  and  that  rotating  needles  require  about  30 
msec  apiece  for  sign  recognition.  The  evidence  that  direction  of  rotation  is  processed 
serially  is  in  sharp  contrast  with  the  finding  by  Nakayama  and  Silverman  (1986)  that 
translation  direction  is  processed  in  parallel.  Nakayama  and  Silverman  employed  a 
standard  search  paradigm  by  assessing  reaction  time  as  a  function  of  the  number  of 
motion  elements.  Their  motion  elements  were  sinusoidal  waveforms  contained  within 
stationary  apertures.  Periodic  boundary  conditions  were  imposed  so  that  deletion  of  the 
grating  on  one  side  of  the  aperture  was  coincident  with  reintroduction  of  the  grating  on 
the opposite side. They found that a field of translating gratings,
divided into regions based on direction, will effortlessly segment and form distinct
boundaries. We have verified this conjecture in numerous simulations (Gilden and
Kaiser,  1992). 
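
The serial and parallel patterns can be contrasted with a simple linear reaction-time sketch. The per-item cost of 30 msec is the figure reported above for rotation sign; the 500 msec base time, the set sizes, and the assumption that every element is checked in turn are hypothetical choices made only for illustration.

```python
def predicted_rt(n_items, per_item_ms, base_ms=500.0):
    """Linear search model: a fixed base time plus a per-item cost.
    The 500 ms base is an arbitrary placeholder, not a reported value."""
    return base_ms + per_item_ms * n_items

def search_slope(set_sizes, rts):
    """Slope in ms per item between the smallest and largest set size."""
    return (rts[-1] - rts[0]) / (set_sizes[-1] - set_sizes[0])

set_sizes = [2, 4, 8, 16]
serial_rts = [predicted_rt(n, 30.0) for n in set_sizes]   # rotation sign
parallel_rts = [predicted_rt(n, 0.0) for n in set_sizes]  # translation direction

print(search_slope(set_sizes, serial_rts))
print(search_slope(set_sizes, parallel_rts))
```

A slope near zero is the signature of parallel processing in this paradigm, whereas a positive slope indicates a cost that grows with the number of elements.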

The  distinction  between  translation  and  rotation  in  processing  style  should  not  be 
limited  to  the  perception  of  sign  or  direction.  We  have  conducted  a  pilot  study  in  which 
it was demonstrated that rotation speed is processed in parallel in the following sense: a
fast rotating element will pop out in a field of slow rotating elements, but not vice versa.
(This asymmetry in magnitude is common to many visual attributes [Treisman and
Souther, 1985], and it was also found to hold for translating gratings.) In this pilot
study,  all  of  the  rotating  elements  were  exactly  the  same  size.  If  size  is  equated,  then 
the  angular  velocity  at  the  objects’  boundaries  is  no  longer  confounded  with  distance 
from  the  axis. 

The  collection  of  empirical  results  suggests  the  following  theorem:  Only  those 
attributes  of  motions  that  are  representable  by  diagonal  correlation  are  processed  in 
parallel.  A  geometric  version  of  this  theorem  is  that  a  motion  attribute  is  processed  in 
parallel  if  and  only  if  the  attribute  does  not  require  reference  to  a  point  (axis),  line,  or 
plane  that  organizes  the  optic  flow. 

Implications of the constraints for memory

One  of  the  most  obvious  yet  significant  differences  between  translations  and 
rotations  is  that  the  former  has  a  perceptible  accumulation  in  displacement,  whereas  the 
latter typically does not. Only rotations of less than 360° have a noticeable orientation
displacement; continuous rotations are cyclic, and thus their accumulation cannot be
appreciated without counting. Imagine that you are observing a rolling wheel. The
wheel translates from here to there as it rotates. This displacement of the wheel is an
observable product of its motion. The number of revolutions incurred during the
excursion cannot be appreciated without attention to the cycling of some feature on the
wheel and a counting of its cycles. In everyday situations, why would anyone want to do
this? It is our contention that people pay very little attention to rotations, and for this
reason, they are poorly remembered.

In  earlier  research  we  have  noted  that  people  have  little  understanding  of 
rotational  dynamics  relative  to  their  understanding  of  translational  dynamics  (Proffitt  & 
Gilden, 1989; Proffitt et al., 1990). Surprisingly, this lack of understanding is not
generally characterized by biases or misinformation. Rather, people appear to have no
commitments as to why things rotate the way they do. If pressed into explaining, say,
why a top precesses, they will concoct some form of explanation, but their confidence in
the explanation is generally quite low, and it is given only because it was demanded. Such
explanations are not part of the corpus of beliefs that people live with; they are made
up on the spot. In contrast, there is a systematicity to both the erroneous and correct
ideas that people evince about translational motions (McCloskey, Caramazza, &
Greene, 1980; Kaiser, Jonides, & Alexander, 1986). Thus, the distinction between what
people know about translation and rotation is not measured by magnitude. The
distinction is deeper: people are not meaningfully engaged with the dynamical
consequences of rotational motion; for the most part they do not care about it.

Consider an experiment discussed by Proffitt et al. (1990): When shown a pair of
wheels  (say  one  large  and  one  small)  and  asked  about  which  will  roll  faster  down  an 
inclined  plane,  people  behave  as  if  they  have  no  experience  with  rolling  objects.  This  is 
as  true  for  physicists  as  it  is  for  undergraduates  in  psychology  courses.  Of  course  a 
physicist  can  deduce  the  correct  answer  from  the  equations  of  motion;  the  point  is  that 
the  physicist  does  not  know  the  answer  until  the  equations  are  solved.  Is  it  the  case 
that people do not have experience with rotation and rolling wheels? Surely not. A
better hypothesis is that people do not pay attention to rotation and so behave as if they
have little experience. Recent experiments by Hecht (1993), showing that people neglect
rotation in making naturalness judgments about rotating wheels, provide further
evidence for this point of view. The consequence of not paying attention to rotation is
that it is not encoded; there is no perceptual learning, no formation of expertise, nor
any of the concomitant experiential benefits associated with memory.

Attentional  and  ordering  constraints  couple  into  each  other  in  providing  a 
coherent  account  for  why  people  do  not  encode  their  experiences  with  rotation.  The 
cyclic  nature  of  rotation  makes  it  inconsequential  for  the  important  task  of  determining 
where an object is going. In general, rotation does not provide meaningful or useful
information. Most rotations arise simply as a product of initial conditions, say when a
dropped or falling object acquires some initial angular momentum. In this sense it is
desirable to ignore rotation. The cyclic nature of rotation also makes it a sink for
attentional resources. Thus, it is also the case that certain aspects of rotation can in
fact be ignored. Rotation direction, for example, does not pop out. In contrast, it is
not possible to ignore direction of translation; the flip side of preattention is that the
voluntary  aspects  of  attention  do  not  mediate  the  processing  of  the  information.  People 
therefore  experience  a  certain  harmony  regarding  rotation  and  translation; 
consequential  information  is  delivered  at  no  cost  and  inconsequential  information  is 
ignorable. 

Implications  of  the  constraints  for  imagined  motions 

The  theoretical  account  of  motion  processing  that  we  have  developed  is  based  on 
the idea that axes cannot be represented by motion mechanisms that effect diagonal
correlation. The representation of an axis requires some elaboration of spatial layout: a
form analysis that is distinct from just detection of motion. An axis of rotation and the
representation of its sign minimally require such primitive notions as
top-moving-rightwards  or  bottom-moving-leftwards.  These  spatial-motion  interactions 
are  precisely  what  cannot  be  achieved  by  diagonal  correlation. 

Spatial  relations  such  as  top  and  bottom  are  only  well-defined  within  a  given 
frame  of  reference.  They  are  relative  terms  and  implicitly  refer  to  the  axis  that  gives 
them  definition.  Representation  of  an  axis  always  implies  the  establishment  of  a  frame 
of reference. The serial nature of axis sign processing is essentially a statement that axis
frames are exclusive: only one can be processed at a time. All translational motions can
be  represented  within  a  single  frame  of  reference,  whereas  representing  rotations 
requires  a  uniquely  specified  axis  for  every  rotation.  We  shall  consider  cognitive 
understandings  in  environments  where  an  axis  frame  must  be  represented  in  conjunction 
with  other  frames  of  reference. 

The simplest form of the conjunction of an axis-defined frame with a secondary
frame is encountered in the case of a wheel that rolls without slipping. The rotational
motion  defines  an  axis  that  itself  moves  in  a  background  environmental  frame.  The 
wheel's  configuration  is  defined  within  a  coordinate  system  that  revolves  about  the 
rotational  axis  with  the  wheel.  Imagine  a  rolling  clock;  its  top  (12  o'clock)  is 
continuously  changing  its  position  relative  to  the  environment  but  not  relative  to  its 
object-centered  reference  frame.  The  secondary  frame,  which  we  shall  refer  to  as  the 
translation frame, is not structured by the axis. For example, the primitive spatial
relations  that  are  defined  by  the  axis,  such  as  the  rotating  wheel's  top  and  bottom,  have 
no  meaning  in  this  environmental  frame. 

A  problem  that  we  consider  to  be  representative  of  those  discussed  under  this 
heading  is  depicted  in  Figure  2.  The  reader  may  wish  to  consult  their  own  first 
impressions  of  the  number  of  revolutions  required  to  traverse  the  line.  Our  own 
introspective assessment of this problem is that "lots of revolutions" are required. In
fact,  only  two  revolutions  will  bring  the  wheel  across  the  line.  The  manifest  difficulty  in 
solving  this  problem  appears  to  reside  in  the  simple  observation  that  it  is  hard  to  see 
how  the  rotational  motion  is  coupled  into  the  translational  motion.  Cognitively,  the 
rotational  motion  is  one  thing  and  the  translational  motion  is  another  thing  and  what 
the  two  have  to  do  with  each  other  is  not  obvious.  Another  way  of  saying  this  is  that 
the  two  motions  refer  to  different  frames  of  reference  and  the  coupling  between  the 
frames is not perceptually or cognitively transparent.

How many times will the wheel spin around
as it rolls along the line?

Figure 2

Assessment of the magnitude of the error is somewhat subtle. Hecht (1992)
administered  this  question  to  a  large  group  of  undergraduates  as  part  of  a  mass  testing 
survey.  He  found  that  there  was  about  a  15%  error  in  overestimating  the  number  of 
revolutions.  (No  systematic  bias  was  found  for  judgments  of  "rolling"  squares  and 
triangles in which translational frames were appropriate to both the object- and
environment-centered  reference  frame.)  We  believe  that  Hecht’s  procedure  was  not  as 
sensitive  to  the  difficulty  entailed  in  this  problem  as  it  might  have  been;  he  assessed 
what  people  can  figure  out  given  unlimited  time  and  not  their  first  impressions.  This  is 
an  important  distinction.  Hecht’s  methodology  permitted  the  subjects  to  solve  the 
problem  at  their  leisure.  This  problem  can  be  figured  out;  it  does  not  exceed  the 
capacities  of  college  age  adults.  One  way  to  figure  it  out  is  to  mentally  snip  the  wheel, 
lay  it  out  onto  the  line,  and  evaluate  how  many  copies  of  the  flattened  wheel  are 
required  to  cover  the  line.  We  are  not  interested  in  what  people  can  figure  out;  a 
perception-based  limitation  will  only  be  manifest  if  people  agree  to  disclose  their  first 
impressions. 

We  administered  this  question  to  45  undergraduates  as  part  of  a  class  lecture. 
They  were  instructed  to  answer  as  quickly  as  possible  on  the  basis  of  their  first 
impressions.  These  instructions  were  repeated  several  times.  The  questions  and  figures 
were  then  displayed  for  5  seconds  on  a  large  screen  via  an  overhead  projector.  We 
found that there is a strong bias to overestimate the number of times the wheel will revolve
along the line; 2 is the correct answer, while the mean response was 3.5. Our
methodology is subject to the same criticism that we gave to Hecht’s in that we had
little control in imposing a deadline for response. In informal administrations of this
problem to colleagues, we have consistently found that when harangued to give an
immediate answer, most people report that 5 or 6 revolutions are required. This is
quite close to the number of displacements that will cover the line.
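
The arithmetic behind this comparison can be sketched as follows. The dimensions of Figure 2 are not reproduced here, so the wheel diameter is a hypothetical unit value and the line length is chosen to match the stated answer of two revolutions. The sketch also assumes that "displacements" refers to the number of wheel diameters laid end to end along the line.

```python
import math

def revolutions(line_length, diameter):
    """Rolling without slipping: one revolution advances one circumference."""
    return line_length / (math.pi * diameter)

def diameters_along_line(line_length, diameter):
    """How many wheel-width displacements are needed to cover the line."""
    return line_length / diameter

diameter = 1.0                       # hypothetical unit wheel
line_length = 2 * math.pi * diameter  # chosen so the correct answer is 2

print(revolutions(line_length, diameter))
print(diameters_along_line(line_length, diameter))
```

Under these assumptions the line is about 6.3 diameters long, which falls near the 5 or 6 that people report when pressed. That pattern is consistent with first impressions tracking translational displacement rather than the rotation itself.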

A second problem that illustrates the issues discussed under this heading can
be appreciated by working through the following exercise. Before proceeding, cover the
next paragraph, since the answer is given there. Figure 3 shows two pennies in contact
with  each  other,  one  placed  above  the  other.  Suppose  that  the  top  penny  is  rolled 
around  the  circumference  of  the  bottom  one,  without  slipping,  until  it  returns  to  its 
original  position.  How  many  revolutions  will  the  top  penny  make  in  its  excursion? 
Answer  quickly,  and  then  take  some  time  to  think  this  problem  through  before 
uncovering  the  next  paragraph. 


This is a hard problem! The answer is two. If you got this problem right, then
you are exceptional, as the vast majority of people will say one if prohibited from acting
out  the  solution  with  real  coins.  What  makes  this  problem  hard  is  that  the  rotational 
reference  frame  of  the  top  penny  must  be  coupled  with  the  rotational  frame  defined  by 
the  one  on  the  bottom.  Formally,  this  problem  is  quite  similar  to  the  apparently  simpler 
task  given  above  in  estimating  the  rotations  required  to  produce  a  given  displacement. 
That  is,  the  reference  frame  of  rotation  must  be  coupled  with  a  secondary  frame.  In  the 
first  problem  the  secondary  frame  is  translational,  whereas  in  the  second  problem  it  is 
rotational.
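
The answer to the penny problem follows from the path of the rolling coin's center, as the sketch below shows. It assumes ideal circular coins and rolling without slipping; the function name and the unit radii are illustrative.

```python
import math

def rolling_coin_revolutions(fixed_radius, rolling_radius):
    """Revolutions made by a coin rolled once around a fixed coin.
    The center of the rolling coin travels a circle of radius
    (fixed_radius + rolling_radius), and each revolution advances the
    center by one circumference of the rolling coin."""
    center_path = 2 * math.pi * (fixed_radius + rolling_radius)
    return center_path / (2 * math.pi * rolling_radius)

equal_pennies = rolling_coin_revolutions(1.0, 1.0)
print(equal_pennies)
```

For two equal pennies the result is two revolutions. The intuitive answer of one counts only the contact arc (the circumference of the fixed coin) and omits the extra turn contributed by carrying the coin's own frame once around the other.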

A similar demonstration of people's inability to deal well with imaginal
transformations involving multiple reference frames is seen in the work of Pani (in
press). He asked people to predict the appearance of a square patch, mounted through
its center on a rod, after the rod had been rotated. When the patch and the rod shared
reference frames (that is, the normal to the patch coincided with the rod), performance was
excellent. When the patch was mounted obliquely on the rod and the rod was not
vertical,  performance  was  quite  poor.  Average  errors  were  over  45  degrees. 

Implications  of  the  constraints  for  cognitive 
understandings  of  motion  dynamics 

The  coupling  of  rotation  with  translation  has  dynamical  consequences  for  the 
motions  of  a  rolling  ball.  Hecht  (1992)  showed  that  people  are  extremely  muddled 
about  how  the  spin  of  a  ball  affects  its  trajectory.  As  depicted  in  Figure  4,  the  spin  of  a 
ball  that  is  moving  across  a  planar  surface  (along  the  z-axis)  can  be  decomposed  into 
three  parts  (Walker,  1985;  Whitehead  &  Curzon,  1983).  In  keeping  with  conventions  of 
billiards, spin around a vertical axis (y) is called English or side English. Spin around a
horizontal axis perpendicular to the motion (x) is called follow or draw, and spin around
the axis of motion (z) is called massé. Given an initial straight motion of the ball in the
z-axis direction, English, follow, or draw will not change its linear trajectory, but massé
will. Clockwise massé will make the ball curve to the right, while counter-clockwise massé
makes it curve to the left.
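
The decomposition just described can be written out as a small sketch. It is an illustration only, not code from Hecht (1992) or the cited sources; the names `SpinComponents`, `decompose_spin`, and `path_shape` are hypothetical, and the sign convention (positive z-spin means clockwise massé) is an assumption made here for concreteness. The sketch encodes only the qualitative rule stated in the text: the spin component about the axis of motion is the one that bends the path.

```python
from typing import NamedTuple


class SpinComponents(NamedTuple):
    english: float      # spin about the vertical axis (y)
    follow_draw: float  # spin about the horizontal axis perpendicular to motion (x)
    masse: float        # spin about the axis of motion (z), i.e. massé


def decompose_spin(omega_x: float, omega_y: float, omega_z: float) -> SpinComponents:
    """Label the three components of a spin vector for a ball moving along z."""
    return SpinComponents(english=omega_y, follow_draw=omega_x, masse=omega_z)


def path_shape(spin: SpinComponents, tolerance: float = 1e-9) -> str:
    """Qualitative path for a ball that starts out moving straight along z.

    Assumed convention: positive massé is clockwise and curves the ball right.
    English, follow, and draw leave the linear trajectory unchanged.
    """
    if abs(spin.masse) <= tolerance:
        return "straight"
    return "curves right" if spin.masse > 0 else "curves left"


if __name__ == "__main__":
    print(path_shape(decompose_spin(0.0, 5.0, 0.0)))   # pure English
    print(path_shape(decompose_spin(3.0, 0.0, 0.0)))   # pure follow or draw
    print(path_shape(decompose_spin(0.0, 0.0, 2.0)))   # clockwise massé (assumed sign)
```

The rule deliberately ignores magnitudes: it says whether the path bends and in which direction, not by how much.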


Figure  4 

Hecht assessed undergraduates’ predictions about how these three spins would
affect the trajectory of a rolling ball and found that they did not distinguish between
English and massé. In particular, they predicted that follow and draw would not affect
the ball’s trajectory; however, they were overwhelmingly sure that both English and massé
would cause it to follow a curved path. Hecht created computer animations of this
event and found that people judged as natural those paths that were typically predicted.
That is, a ball with English was erroneously judged as appearing more natural when it
followed a curved path than when it followed a straight one. In essence, these subjects
seemed to assume that spin occurring in any direction other than that in which the ball
was rolling would cause the ball to curve. Professional billiards players were also tested,
and they were found to have a correct understanding of the dynamics of spin, although,
not surprisingly, they could not make quantitative judgments about spin’s effect.

Summary of Aims and Results during Funding Period

Our  grant  application  proposed  four  distinct  sets  of  experiments.  Significant 
progress  was  made  on  each  and  this  work  is  summarized  below. 

Dynamical  understandings  of  multidimensional  systems.  In  this  area,  seven 
articles  and  chapters  were  published,  are  in  press,  or  have  been  submitted.  In  addition, 
one  doctoral  dissertation  has  been  completed. 

The work reported in McAfee and Proffitt (1991) addressed the issue of why
many people act as if they do not know that the surface of a liquid remains invariantly
horizontal regardless of the orientation of its container. It was shown that erroneous
judgments reflect different problem representations that people are apt to form, and that
the representation that leads people to make erroneous judgments is evoked by a perceptual
frame-of-reference bias.

Kaiser,  Proffitt,  Whelan,  and  Hecht  (1992)  investigated  the  conditions  in  which 
viewing  animated  displays  leads  to  better  dynamical  intuitions  than  are  evoked  in 
paper-and-pencil  tasks.  They  found  that  animation  is  useful  only  when  the  dynamical 
situation is unidimensional. This work confirms a basic tenet of our approach, which is
that the dynamics of multidimensional systems are not perceptually penetrable. These
studies,  as  well  as  work  supported  by  our  first  AFOSR  grant,  are  summarized  in  Gilden 
(1991,  1993)  and  Proffitt  and  Kaiser  (in  press).  These  theoretical  and  review  papers 
challenge  current  theories  about  people’s  abilities  to  perceive  dynamical  properties.  It  is 
argued  that  people  employ  heuristics  when  evaluating  ongoing  dynamical  systems  and 
that  their  ability  to  extract  relevant  motion  information  is  limited  by  general  principles 
of  perceptual  organization.  A  specific  comparison  of  our  account  with  a  direct 
perception  approach  is  presented  in  Gilden  and  Proffitt  (in  press). 

Heiko Hecht (1992) has completed a Ph.D. dissertation on work supported by this
grant. This dissertation project investigated the understanding of rotational motions by
novices and professional billiard players. (The latter were recruited and tested in
Washington, D.C.) Three basic findings emerged. First, when judging wheels rolling
on a horizontal plane, observers who were instructed to attend to the coupling of
rotation and translation could do so; however, when judging the naturalness of wheels
rolling down a ramp, observers disproportionately focused on only translation. Second,
almost everyone mistakenly believes that English (spin around a vertical axis) should
make a rolling ball curve. Visual animation does not improve performance. Finally,
professional billiard players share some of these misconceptions. They were found to
use procedural heuristics to execute their shots that do not require adequate conceptual
or perceptual understandings about the dynamics of spin. These studies are currently
being written up for submission for publication.

Marco Bertamini (1992) completed and published his Master's thesis on an
investigation of memory representations for position in dynamical contexts. He found
that  when  shown  a  static  image  of  a  ball  located  on  an  inclined  plane,  memory  for  the 
ball’s  position  is  displaced  downward. 

Learning  to  evaluate  dynamical  systems.  Gregory  Kean  completed  his  Master's 
thesis  with  Gilden  on  an  investigation  of  the  ability  to  judge  differences  and  ratios 
within static and kinematic variables. He found that both processes of judgment exist
independently for translation speed, rotation speed, numerosity, size, and angular extent.
Each judgment type satisfied the axioms of the representation and uniqueness
theorems needed to infer the existence of two independent algebraic difference structures.
There was additional evidence that these judgments are linked appropriately to infer
that these quantities are measured perceptually on ratio scales. These results provide
strong evidence that failures in dynamical understandings arise when comparisons are
made across stimulus dimensions, as judgments are adequate within single dimensions.
This  work  is  being  prepared  for  publication. 

The research discussed above with professional billiard players also relates to the
work proposed in this section. Hecht’s dissertation work showed that novices could be
trained within about 15 minutes to judge the rebound trajectories of balls spinning with
English and to perform this task as well as the professionals. This finding
indicates again that perceptual competencies in this domain are not particularly
complex.

Path  perception  in  both  apparent  and  continuous  motions.  Hecht  and  Proffitt 
(1991)  found  that  the  apparent  motion  of  an  object  that  undergoes  an  orientation 
change  in  depth  is  resolved  by  a  perceived  curved  trajectory  in  depth.  This  work 
supported  the  prediction  made  in  the  grant  proposal  that  apparent  motions  are 
constrained  by  kinematic  as  opposed  to  dynamic  constraints. 

Basic  issues  in  motion  information  processing.  The  experiments  proposed  in  this 
section  were  completed  and  four  articles  have  been  published  or  are  in  press.  Proffitt, 
Rock, Hecht, and Schubert (1992) reported a set of studies on perceiving depth from
the stereokinetic effect. It was found that the stereokinetic effect, an illusion, is
symptomatic of the perceptual processes that derive depth from small rigid object
rotations. Similarly, Caudek and Proffitt (1993) showed that the stereokinetic effect
evokes the same perceptual response as do appropriately matched motion parallax
displays. A general model for perceiving depth from monocular motion information is
presented  in  these  two  works.  In  essence,  it  is  argued  that  the  perceptual  system 
extracts  only  a  subset  of  the  motions  present  in  optical  flow  and  combines  this  with 
inherent  perceptual  biases.  Schmuckler  and  Proffitt  (in  press)  showed  that  infants 
respond to stereokinetic effect displays in a manner suggesting that they perceive depth.
Finally, Kaiser and Proffitt (1992) showed how stereokinetic displays could be employed to
reduce the computational resources required to create depth impressions in moving
displays.


Conclusions 


Our  ability  to  perceive,  remember,  imagine,  and  reason  about  motions  is  related 
to  the  mathematical  constraints  that  are  required  to  represent  different  kinds  of  motions 
and to physiological constraints that exist in motion processing. The constraints of
each kind are listed below.

Mathematics: 

1.  The  representation  of  rotation,  divergence,  and  shear  motion  fields  requires 
the  specification  of  spatial  layout  sufficient  to  characterize  axes  or  lines  at 
specific  positions  in  the  optic  array.  Only  translation  can  be  represented  without 
reference to spatial layout. A translating body can be treated as a point particle;
all other motions entail that objects be treated as extended bodies.

2. Rotation, in particular, does not generate a globally ordered sequence of
displacements.  More  rotation  does  not  always  lead  to  perceptually  larger  angles. 
Angle  accumulates  perceptually  modulo  180°. 
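
One way to read the second point is sketched below. This is an illustrative interpretation rather than a model taken from the studies; the function names and the 60-degree step size are assumptions chosen for the example. Successive translations sum to an ever larger displacement, whereas successive rotations, taken modulo 180 degrees, wrap around and so do not form a globally ordered sequence.

```python
from itertools import accumulate
from typing import List, Sequence


def accumulated_displacements(steps: Sequence[float]) -> List[float]:
    """Running total of translation steps: grows without bound."""
    return list(accumulate(steps))


def perceived_angles(steps_deg: Sequence[float]) -> List[float]:
    """Running total of rotation steps, reduced modulo 180 degrees."""
    return [total % 180.0 for total in accumulate(steps_deg)]


if __name__ == "__main__":
    steps = [60.0, 60.0, 60.0, 60.0]  # hypothetical equal increments
    print(accumulated_displacements(steps))
    print(perceived_angles(steps))
```

With these steps the translation totals keep increasing, while the third rotation step returns the angle to zero, so more rotation does not yield a larger angle.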

Physiology: 

Direction-selective cortical motion detectors are specific to translational motions
prior to the medial superior temporal (MST) area. Early neural hardware is
designed to extract translation vector fields.

These  constraints  make  translation  a  special  case  in  cognitive  and  perceptual 
processing.  The  uniqueness  of  translation  was  investigated  broadly  within  four  areas  of 
research. 

1.  Attention.  Translation  is  processed  preattentively,  whereas  other  motion  fields 
require  focused  attention.  This  difference  arises  from  the  requirement  that  a  geometric 
element  in  spatial  layout  be  specified  for  all  nontranslational  motion  fields,  and  the 
positioning  of  a  geometric  element  requires  focused  attention. 

2.  Memory.  Translational  motion  is  preferentially  encoded  and  is  therefore 
remembered better. This preference arises for two reasons: (a) rotations do not
accumulate as translations do; they are bounded and repetitive; and (b) rotations and
translations require different object representations, as extended bodies and point
particles, respectively. Kinematic analysis proceeds primarily on the basis of a point
particle representation, i.e., in terms of where the object as a whole went.

3.  Imagination.  Rotations  are  harder  to  mentally  manipulate  than  translations. 
The  inherent  incompatibility  of  rotational  and  translational  representation  has 
implications for the ease with which they can be manipulated by thought.

4.  Reasoning.  Rotations  are  harder  to  understand  dynamically  than  translations. 
The  coupling  of  rotation  and  translation  has  dynamical  consequences  for  rolling  objects. 